ABSTRACT

Marked statistics allow sensitive tests of how galaxy properties correlate with envi- 
ronment, as well as of how correlations between galaxy properties are affected by 
environment. A halo-model description of marked correlations is developed, which 
incorporates the effects which arise from the facts that typical galaxy marks (e.g., 
luminosity, color, star formation rate, stellar mass) depend on the mass of the par- 
ent halo, and that massive haloes extend to larger radii and populate denser regions. 
Comparison with measured marked statistics in semi-analytic galaxy formation models 
shows good agreement on scales smaller than a Megaparsec, and excellent agreement 
on larger scales. The halo-model description shows clearly that the behaviour of some 
low-order marked statistics on these scales encodes information about the mean galaxy 
mark as a function of halo mass, but is insensitive to mark-gradients within haloes. 
Higher-order statistics encode information about higher order moments of the distri- 
bution of marks within haloes. This information is obtained without ever having to 
identify haloes or clusters in the galaxy distribution. On scales smaller than a Mega- 
parsec, the halo-model calculation shows that marked statistics allow sensitive tests of 
whether or not central galaxies in haloes are a special population. A prescription for 
including more general mark-gradients in the halo-model description is also provided. 
The formalism developed here is particularly well-suited to interpretation of marked 
statistics in astrophysical datasets, because it is phrased in the same language that is 
currently used to interpret more standard measures of galaxy clustering. 

Key words: galaxies: formation - galaxies: haloes - dark matter - large scale structure 
of the universe 



1 INTRODUCTION 

Almost all clustering analyses to date treat galaxies as 
points without attributes. However, galaxies have luminosi- 
ties, sizes, shapes, velocity dispersions, star formation rates, 
etc. Recent work (Hamilton 1988; Norberg et al. 2002; Ze- 
havi et al. 2005) has begun to study how galaxy correlations 
depend on luminosity and color — the more luminous galax- 
ies are more strongly clustered, and red galaxies tend to clus- 
ter more strongly than blue. However, the quality of the data 
is now sufficiently good that one can imagine measuring, not 
just galaxy clustering as a function of galaxy attribute, but 
the spatial correlations of the attributes themselves. That is 
to say, rather than measuring clustering as a function of lu- 
minosity, one can now measure the clustering of luminosity 
(or of color, star-formation rate etc.). 

Börner, Mo & Zhao (1989) were among the early pioneers,
studying the correlation functions of galaxies with different
weightings according to luminosity and mass. Although also
discussed by Peebles (1980), this sort of approach has been
formalized under the framework of marked point processes
(e.g. Stoyan 1984; Stoyan & Stoyan 1994). Marked statistics
have recently been applied to astrophysical datasets by
Beisbart & Kerscher (2000), Beisbart, Kerscher & Mecke (2002),
Gottlöber et al. (2002) and Faltenbacher et al. (2002).

E-mail: shethrk@physics.upenn.edu

Marked statistics provide a useful framework for de- 
scribing point processes in which the points have attributes 
or weights. They are particularly well-suited to identifying 
and quantifying correlations between galaxy properties
(luminosities, colors, stellar masses, star formation rates) and
their environments (e.g. Sheth, Connolly & Skibba 2005),
particularly when such correlations are weak (Sheth &
Tormen 2004). The halo model (reviewed in Cooray & Sheth
2002) is the framework within which traditional (i.e.
unmarked) measurements of galaxy clustering are currently
interpreted (e.g., Magliocchetti & Porciani 2003; Mo et al.
2004; Zehavi et al. 2005; Collister & Lahav 2005). This paper
develops the halo-model description of marked statistics.

Section 2 defines a number of marked statistics. Section 3
provides a halo model calculation of these marked
statistics, under the assumption that marks do not correlate
with spatial position within haloes. The analysis extends
ideas presented in Sheth, Abbas & Skibba (2004). It then
compares the halo-model description with measurements in
simulations. Section 4 shows how the halo-model description
can be extended to allow for correlations between marks and
position within the halo (mark gradients). It pays special
attention to the case in which the central object in a halo is
different from all the others. The analysis shows how the
halo model can be used to test simple physical models of
why galaxy properties correlate with environment. On larger
scales, the halo-model description of marked statistics shows
that they can be thought of as being linearly biased versions
of unweighted statistics; this is the subject of Section 5. A
final section discusses how the methods presented here
provide the basis for interpreting measurements of marked
statistics which can be made with databases currently
available. It also shows how marked statistics can be used to
interpret measurements which indicate that correlations
between galaxy properties (e.g., the correlation between stellar
mass and K-band luminosity) also correlate with
environment. An Appendix illustrates some of the key ideas using
a fully analytic toy model.



2 MARKED STATISTICS 

In what follows, a mark is a weight or attribute associated 
with each point in a point process. To make the discussion 
less abstract, we will often use astrophysical terms to illus- 
trate our arguments. Thus, a point process is a galaxy cat- 
alog, and a mark can be any observable property associated 
with a galaxy, such as luminosity, color, velocity dispersion, 
size, star formation rate, etc. Marked statistics measure the 
clustering of marks. Since the positions at which the marks 
are measured may themselves be clustered, marked statistics 
are defined in a way which accounts for this. 

For example, let $\rho$ denote the mean density of particles,
and let $\bar w$ denote the mean mark, averaged over all particles.
Now consider a particle with mark larger than this mean
value. Are the particles neighbouring it also likely to have
larger marks? One way to quantify this is to compute the
mean mark of pairs of particles, in units of $\bar w$, as a function
of pair separation. The typical number of pairs at separation
$r$ is $\rho^2[1+\xi(r)]$, where $\xi$ is the two-point correlation function.
Therefore, the mean mark is

$$ M_1(r) = \frac{\sum\,[w(x)+w(y)]\; I(|x-y|-r)}{\sum\,[\bar w+\bar w]\; I(|x-y|-r)}, \tag{1} $$

where $I(x) = 0$ unless $x = 0$, and the sum is over all galaxy
pairs. We have divided by $\bar w$, so $M_1(r) = 1$ for all $r$ if
there are no correlations between marks. Analogously, the
nth-order mark is defined by

$$ M_n(r) = \frac{\sum\,[w(x)+w(y)]^n\; I(|x-y|-r)}{(2\bar w)^n\,\rho^2\,[1+\xi(r)]}. \tag{2} $$

For what follows, it is also useful to define

$$ C_n(r) = \frac{\sum\,[w(x)-w(y)]^n\; I(|x-y|-r)}{(2\bar w)^n\,\rho^2\,[1+\xi(r)]}. \tag{3} $$

It is sufficiently straightforward to generalize these concepts
of n-th order marks from pairs to N-tuples that we have not
written the expressions explicitly.



In what follows, we will mainly study the cases when
$n = 1$ and 2. In this regard, it is helpful to rewrite $M_2$ as

$$ M_2(r) = \frac{\sum\,[w(x)-w(y)]^2\; I(|x-y|-r)}{(2\bar w)^2\,\rho^2\,[1+\xi(r)]}
          + \frac{\sum\, w(x)\,w(y)\; I(|x-y|-r)}{\bar w^2\,\rho^2\,[1+\xi(r)]}. \tag{4} $$

The second term involves a similar sum to that which
defines $1+\xi$, except that now each particle of the pair
contributes a weight $w$. Thus, the second term can be thought
of as a 'weighted' correlation function. If we write it as
$[1+W(r)]/[1+\xi(r)]$ then

$$ M_2(r) = C_2(r) + \frac{1+W(r)}{1+\xi(r)}. \tag{5} $$

When $n = 2$, it is perhaps more intuitive to study the mark
variance and covariance, defined by

$$ {\rm var}(r) = M_2(r) - M_1^2(r) + C_2(r) \tag{6} $$

and

$$ {\rm cov}(r) = M_2(r) - M_1^2(r) - C_2(r). \tag{7} $$



To help build intuition, it is perhaps useful to consider
how one might estimate these marked statistics in a data
set. Consider the quantity $1+\xi(r)$ which appears in the
denominator of all the expressions above. A common estimator
for it is to simply sum the number of data pairs with
separation $r$ in the point distribution and divide it by the
number of pairs of similar separation in a random distribution.
The suggestive notation for this estimator is DD/RR. Now
consider the quantity $(1+W)/(1+\xi)$. Similarly suggestive
notation for the estimator of $1+W(r)$ is WW/RR. However,
since we are interested in the ratio of these two terms,
the appropriate estimator is WW/DD. Note that DD is
precisely the sum in the denominator of equation (1).
Since WW/DD is simply the average over all pairs
in the sample of the product of the weights, it can be
estimated without explicitly constructing a random catalog,
and without explicitly worrying about the survey geometry.
Similarly simple estimators for the other marked statistics
defined above can also be constructed (e.g., $M_1(r)$ can be
estimated as WD/DD), making them far less time-consuming
to estimate than the usual unweighted statistics such as $\xi$.
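The pair-count estimators described above can be sketched in a few lines of Python. This is a minimal brute-force illustration, not the procedure used for the measurements in this paper: the function name `marked_pair_statistics`, the binning in separation, and the use of a simple Euclidean distance without periodic boundaries are all assumptions made for the example.

```python
import numpy as np

def marked_pair_statistics(positions, marks, r_edges):
    """Brute-force WD/DD and WW/DD in separation bins (hypothetical helper).

    WD/DD estimates the mean pair mark in units of the mean mark;
    WW/DD estimates the weighted-to-unweighted pair-count ratio.
    Bins with no pairs come out as NaN.
    """
    positions = np.asarray(positions, dtype=float)
    marks = np.asarray(marks, dtype=float)
    wbar = marks.mean()

    # All distinct pairs, each counted once.
    i, j = np.triu_indices(len(marks), k=1)
    sep = np.linalg.norm(positions[i] - positions[j], axis=1)

    # Assign each pair to a separation bin; drop pairs outside the bin range.
    nbins = len(r_edges) - 1
    which = np.digitize(sep, r_edges) - 1
    ok = (which >= 0) & (which < nbins)

    dd = np.bincount(which[ok], minlength=nbins).astype(float)
    wd = np.bincount(which[ok],
                     weights=(0.5 * (marks[i] + marks[j]) / wbar)[ok],
                     minlength=nbins)
    ww = np.bincount(which[ok],
                     weights=(marks[i] * marks[j] / wbar**2)[ok],
                     minlength=nbins)

    with np.errstate(invalid="ignore", divide="ignore"):
        return wd / dd, ww / dd
```

No random catalog enters anywhere, which is the point made in the text: both ratios are plain averages over the data pairs in each bin. The O(N^2) pair loop is only practical for small samples; a tree-based pair finder would be the natural replacement for real catalogs.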



3 THE HALO MODEL DESCRIPTION 

This section describes how the marked statistics defined 
above can be written in the language of the halo model. 
The analysis below complements and extends ideas in Sheth, 
Abbas & Skibba (2004). 

In the halo model (Cooray & Sheth 2002, and refer- 
ences therein), the nonlinear density field is assumed to be 
made up of dense objects called haloes. At any given time, 
haloes of different masses all have the same density (they 
are all approximately two hundred times denser than the 
background). All mass is in such haloes, and so all galaxies 
are also associated with haloes. 

In this description, the two-point correlation function
$\xi(r)$ is determined by the sum of two types of galaxy pairs:
pairs in the same halo, and pairs in separate haloes. Since
the radius of a typical halo at $z = 0$ is less than a Mpc, the
one-halo term is negligible on scales larger than a few Mpc.
On the small scales where the one-halo term dominates, the
shape of $\xi(r)$ is determined by how halo density profiles
depend on halo mass, and on how halo abundances depend
on mass; on larger scales, $\xi(r)$ is less sensitive to the shapes
of halo profiles, and more sensitive to the clustering of the
haloes themselves.

For this description, it may help to think of the galaxy
distribution as a density field, in which case

$$ M_n(r) = \frac{\left\langle [w(x)+w(x+r)]^n\, \rho(x)\,\rho(x+r)\right\rangle}{(2\bar w)^n\, \left\langle \rho(x)\,\rho(x+r)\right\rangle}, \tag{8} $$

where the angle brackets denote averages over all space.

3.1 Unweighted statistics 

In the halo model, all mass is bound up in dark matter
haloes which have a range of masses. Hence, the density of
galaxies is

$$ n_{\rm gal} = \int dm\, \frac{dn(m)}{dm}\, g_1(m), \tag{9} $$

where $dn(m)/dm$ denotes the number density of haloes of
mass $m$, and

$$ g_n(m) = \sum_N N(N-1)\cdots(N-n+1)\; p(N|m) \tag{10} $$

is the n-th factorial moment of the distribution $p(N|m)$ of
galaxies in m-haloes. If $p(N|m)$ follows a Poisson
distribution, then $g_n(m) = g_1^n(m)$.
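The statement that Poisson counts have factorial moments equal to powers of the mean can be checked numerically. The sketch below is illustrative only: `factorial_moment` and `poisson_pmf` are hypothetical helper names, and the sum over N is truncated at an assumed `n_max`, which is adequate when the mean is much smaller than `n_max`.

```python
import math

def factorial_moment(pmf, n, n_max=200):
    """n-th factorial moment: sum over N of N(N-1)...(N-n+1) * pmf(N), truncated at n_max."""
    total = 0.0
    for N in range(n, n_max + 1):
        falling = 1.0
        for j in range(n):
            falling *= (N - j)          # builds N(N-1)...(N-n+1)
        total += falling * pmf(N)
    return total

def poisson_pmf(mean):
    """Return the Poisson probability p(N) for a given positive mean."""
    def pmf(N):
        # Work in logs so that large N does not overflow.
        return math.exp(-mean + N * math.log(mean) - math.lgamma(N + 1))
    return pmf

if __name__ == "__main__":
    p = poisson_pmf(3.0)
    for n in (1, 2, 3):
        print(n, factorial_moment(p, n), 3.0 ** n)
```

For a distribution that is not Poisson, the same `factorial_moment` function applies unchanged, but the second moment then has to be supplied (or measured) separately from the first.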

The correlation function is the Fourier transform of the
power spectrum $P(k)$:

$$ \xi(r) = \int \frac{dk}{k}\, \frac{k^3 P(k)}{2\pi^2}\, \frac{\sin kr}{kr}. \tag{11} $$

In the halo model, $P(k)$ is written as the sum of two terms:
one that arises from particles within the same halo and
dominates on small scales (the 1-halo term), and the other from
particles in different haloes which dominates on larger scales
(the 2-halo term). Namely,

$$ P(k) = P_{\rm 1h}(k) + P_{\rm 2h}(k), \tag{12} $$

where

$$ P_{\rm 1h}(k) = \int dm\, \frac{dn(m)}{dm}\, \frac{g_2(m)}{n_{\rm gal}^2}\, |u(k|m)|^2, $$

$$ \frac{P_{\rm 2h}(k)}{P_{\rm Lin}(k)} = \left[\int dm\, \frac{dn(m)}{dm}\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\, b(m)\right]^2. $$

Here $u(k|m)$ is the Fourier transform of the halo density
profile divided by $m$, $b(m)$ is the bias factor which describes the
strength of halo clustering, and $P_{\rm Lin}(k)$ is the power
spectrum of the mass in linear theory. When explicit calculations
are made, we assume that the density profiles of haloes have
the form described by Navarro et al. (1996), and that halo
abundances and clustering are described by the
parameterization of Sheth & Tormen (1999).



3.2 Marked statistics when marks are 
independent of position within halo 

Now consider the weights. Let $p(\mathbf{w}|N,\mathbf{r},m)\,d\mathbf{w}$ denote the
probability that the $N$ galaxies at positions $(r_1,\ldots,r_N)$ in
an m-halo have weights $\mathbf{w} = (w_1,\ldots,w_N)$.

As our simplest model, we will consider the case in
which the weights do not depend on position within the
halo. If, in addition, those weights are independent, then
$p(\mathbf{w}|N,m)\,d\mathbf{w} = \prod_{i=1}^{N} p(w_i|N,m)\,dw_i$. If the distribution of
weights depends on $m$ but is independent of $N$, then this
simplifies further to

$$ p(\mathbf{w}|N,m)\,d\mathbf{w} = \prod_{i=1}^{N} p(w_i|m)\,dw_i. \tag{13} $$



Note that this model assumes that the weight associated 
with one galaxy is independent of the others within a halo, 
but that the distribution of weights depends on the mass 
of the parent halo. Later we will compare this model with 
one in which the distribution of weights depends on distance 
from the centre of the parent halo, but is otherwise indepen- 
dent of the other objects in the halo. 

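The mean weight associated with galaxies in m-haloes is

$$ \langle w|m\rangle = \left[\prod_{i=1}^{N}\int dw_i\, p(w_i|m)\right] \frac{1}{N}\sum_{j=1}^{N} w_j = \int dw\, p(w|m)\, w. \tag{14} $$

The mean weight averaged over all haloes is

$$ \bar w = \int dm\, \frac{dn(m)}{dm}\, \frac{g_1(m)}{n_{\rm gal}}\, \langle w|m\rangle. \tag{15} $$

If we define

$$ p(w) = \int dm\, \frac{dn(m)}{dm}\, \frac{g_1(m)}{n_{\rm gal}}\, p(w|m), \tag{16} $$

then

$$ \langle w^n\rangle = \int dw\, p(w)\, w^n
   = \int dm\, \frac{dn(m)}{dm}\, \frac{g_1(m)}{n_{\rm gal}} \int dw\, p(w|m)\, w^n
   = \int dm\, \frac{dn(m)}{dm}\, \frac{g_1(m)}{n_{\rm gal}}\, \langle w^n|m\rangle. \tag{17} $$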



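The marked statistics defined in the previous section
require averages over particle pairs. So, for instance,

$$ M_1(r) = \frac{1+\mathcal{W}_1(r)}{1+\xi(r)}, \tag{18} $$

where $\mathcal{W}_1(r)$ is the Fourier transform of

$$ \mathcal{W}_1(k) = \mathcal{W}_1^{\rm 1h}(k) + \mathcal{W}_1^{\rm 2h}(k), \tag{19} $$

with

$$ \mathcal{W}_1^{\rm 1h}(k) = \int dm\, \frac{dn(m)}{dm}\, \frac{\langle w|m\rangle}{\bar w}\, \frac{g_2(m)}{n_{\rm gal}^2}\, |u(k|m)|^2, $$

$$ \frac{\mathcal{W}_1^{\rm 2h}(k)}{P_{\rm Lin}(k)} =
   \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{\langle w|m\rangle}{\bar w}\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\right]
   \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\right], $$

and $\xi(r)$ was defined earlier. Note the similarity between the
integrals which define $P_{\rm 1h}$ and $P_{\rm 2h}$, and those for $\mathcal{W}_1^{\rm 1h}$ and
$\mathcal{W}_1^{\rm 2h}$.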
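Similarly, we can write

$$ M_2(r) = \frac{[1+\langle w^2\rangle/\bar w^2]/2 + \mathcal{W}_2(r)}{1+\xi(r)}, \tag{20} $$

where $\mathcal{W}_2(r)$ is the Fourier transform of

$$ \mathcal{W}_2(k) = \mathcal{W}_2^{\rm 1h}(k) + \mathcal{W}_2^{\rm 2h}(k), \tag{21} $$

with

$$ \mathcal{W}_2^{\rm 1h}(k) = \int dm\, \frac{dn(m)}{dm}\, \frac{\langle w^2|m\rangle + \langle w|m\rangle^2}{2\bar w^2}\, \frac{g_2(m)}{n_{\rm gal}^2}\, |u(k|m)|^2, $$

$$ \frac{\mathcal{W}_2^{\rm 2h}(k)}{P_{\rm Lin}(k)} =
   \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{\langle w|m\rangle}{\bar w}\, \frac{g_1(m)}{n_{\rm gal}}\, \frac{u(k|m)}{\sqrt{2}}\right]^2
 + \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{\langle w^2|m\rangle}{2\bar w^2}\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\right]
   \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\right]. $$
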

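If we define the variance of the weights in m-haloes as

$$ V^2(w|m) = \langle w^2|m\rangle - \langle w|m\rangle^2, \tag{22} $$

and set

$$ \langle w^2|m\rangle = V^2(w|m) + \langle w|m\rangle^2 \tag{23} $$

then

$$ \mathcal{W}_2^{\rm 1h}(k) = \int dm\, \frac{dn(m)}{dm}\, \frac{\langle w|m\rangle^2}{\bar w^2}\, \frac{g_2(m)}{n_{\rm gal}^2}\, |u(k|m)|^2
 + \int dm\, \frac{dn(m)}{dm}\, \frac{V^2(w|m)}{2\bar w^2}\, \frac{g_2(m)}{n_{\rm gal}^2}\, |u(k|m)|^2, $$

$$ \frac{\mathcal{W}_2^{\rm 2h}(k)}{P_{\rm Lin}(k)} =
   \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{\langle w|m\rangle}{\bar w}\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\right]^2
 + \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{\langle w^2|m\rangle}{2\bar w^2}\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\right]
   \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\right]
 - \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{\langle w|m\rangle}{\sqrt{2}\,\bar w}\, \frac{g_1(m)}{n_{\rm gal}}\, u(k|m)\right]^2. $$
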

Let $W(k)$ denote the sum of the first term of $\mathcal{W}_2^{\rm 1h}$ with the
first term of $\mathcal{W}_2^{\rm 2h}$. Then the Fourier transform of $W(k)$ is the
weighted correlation function $W(r)$ (e.g. insert this
expression in equation (20) and compare with equation (5)). These
expressions show that $M_1(r)$ and the weighted correlation
function $W(r)$ encode information about the first moment of
$p(w|m)$; information about the scatter around $\langle w|m\rangle$ comes
from $M_2(r)$.

If the distribution of weights, $p(w|m)$, does not
depend on $m$, then $\langle w|m\rangle = \bar w$, so $\mathcal{W}_1(k) = P(k)$, and
$M_1(r) = 1$. Similarly, $\mathcal{W}_2(k) = [(1+\langle w^2\rangle/\bar w^2)/2]\,P(k)$,
and so $M_2(r) = (1+\langle w^2\rangle/\bar w^2)/2$. This is sensible: if the
distribution of weights is independent of $m$, then $M_2(r)$ is
simply the average value of a quantity which is the square
of the sum of two random variates divided by $4\bar w^2$. Thus, if
$p(w|m)$ does not depend on $m$, the marked correlations are
constants, independent of scale, and they have the values
associated with truly independent marks.
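The constant value quoted for the second-order mark can be written out in one line. The short derivation below is a worked check rather than part of the original argument; it assumes only that the two marks $w_1$ and $w_2$ of a pair are drawn independently from the same distribution, with mean $\bar w$ and second moment $\langle w^2\rangle$.

```latex
\begin{align}
\left\langle \frac{(w_1+w_2)^2}{4\bar w^2} \right\rangle
  &= \frac{\langle w_1^2\rangle + \langle w_2^2\rangle
           + 2\,\langle w_1\rangle\langle w_2\rangle}{4\bar w^2} \\
  &= \frac{2\,\langle w^2\rangle + 2\,\bar w^2}{4\bar w^2}
   = \frac{1}{2}\left(1 + \frac{\langle w^2\rangle}{\bar w^2}\right).
\end{align}
```

As a sanity check, if every galaxy carries the same mark then $\langle w^2\rangle = \bar w^2$ and the expression reduces to unity.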

However, the expressions above show that marked
statistics can have non-trivial scale-dependence if $p(w|m)$
depends on $m$, even though galaxy marks do not depend on
position within the parent halo, and the mark of one galaxy
is otherwise independent of the marks associated with the
others. That is, galaxy marks are only correlated with the 
masses of their parent haloes; all other correlations between 
galaxy marks are a consequence of this correlation. In such 
a model, the small-scale dependence of marked correlations 
is a consequence of the fact that the size of a halo depends 
on its mass. On larger scales, the dominant cause of nontriv- 
ial scale-dependence of the weighted correlation function is 
that the spatial distribution of haloes is mass-dependent. 

Notice that the two-halo contributions to $M_1(r)$ and
to the weighted correlation function $W(r)$ depend on the
combination $g_1(m)\,\langle w|m\rangle$; this quantity is the sum of the
marks in a halo, averaged over all haloes of mass $m$, i.e. the
mean total mark in m-haloes. This shows that the large
scale behaviour of these two statistics encodes information
about how this quantity depends on halo mass. Note that
this information is provided without ever actually
dividing the galaxy distribution up into clusters. Furthermore,
if the number of galaxies in a halo follows a Poisson
distribution, then $g_2(m) = g_1(m)^2$, and the one-halo contribution
to $W(r)$ encodes information about the square of this
quantity.



3.3 Comparison with simulations 

Before building a more sophisticated model, it is worth
checking how well this simple description fares when
compared with marked statistics measured in models of galaxy
formation. The GIF semi-analytic galaxy formation
models of Kauffmann et al. (1999) provide a useful testbed for
the halo model description developed above. Measurements
of marked statistics for a variety of marks in these
simulations have been presented in Sheth, Connolly & Skibba
(2005). We study some of them here. In all cases, the
measurements are for a sample of galaxies which contain more
than $2\times10^{10}\,h^{-1}M_\odot$ in stars at $z = 0.2$ in a flat ΛCDM
cosmology with $(\Omega_0, h, \sigma_8) = (0.3, 0.7, 0.9)$. The redshift was
chosen to approximately match the median redshift of the
2dFGRS (Colless et al. 2001) and SDSS (York et al. 2000;
Abazajian et al. 2003) surveys. The mock galaxy catalog
contains about 14,665 objects in a cubical comoving volume
$141\,h^{-1}$ Mpc on a side.

The halo-model calculation requires knowledge of the
first and second factorial moments of the galaxy counts in
haloes. Figure 1 shows how these quantities scale with halo
mass. The smooth curves show



/ N f mil\ -lO/mi, , , -{mil 

gi(m — e ' + 1 — e ^ '-' 

' V 250/ 



/IS)" 



= (250)'^ +°-''°"H"Tooo-;^j' 

where $m_{11}$ denotes the halo mass in units of $10^{11}\,h^{-1}M_\odot$.

Figure 1. First and second factorial moments of the
distribution of the number of galaxies with stellar mass greater than
$2\times10^{10}\,h^{-1}M_\odot$ in haloes of mass M in the GIF ΛCDM
semi-analytic galaxy formation models at z = 0.2.

In addition, calculation of marked statistics requires
knowledge of how the mean weight or mark depends on halo mass.
Figure 2 shows this mass-dependence for a variety of weights:
open triangles, filled triangles, squares, crosses, circles and 
stars show how the mean $L_B$, $L_V$, $L_I$, $L_K$, stellar mass,
and star formation rate depend on halo mass. Most of these 
marks are steeply increasing functions of halo mass, at least 
in the range 2 x 10^^ < M/h'^MQ < 2 x lO". At larger 
masses, the mean luminosity weights are approximately
constant, but depend strongly on waveband: on average, galaxies
in massive haloes are more luminous than average,
although this over-luminosity is larger in the redder bands.
For a given weight, these trends with mass give rise to non- 
trivial scale dependence of the marked statistics. The dif- 
ferent mass dependence of the weights makes the marked 
statistics depend on the type of mark. 

Figure 3 illustrates these differences using the
luminosities in the reddest and bluest bands as the marks. Results
for the two marked statistics which depend only on the mean
mark within haloes are shown: the symbols in the top
panels show $M_1(r)$ in the simulations when the B- (left) and
K-band (right) luminosities are used as the mark; symbols
in the bottom panels show the ratio of the weighted and
unweighted pair counts $(1+W)/(1+\xi)$.

In all panels, the dotted lines show the result of ran- 
domizing the marks, and then repeating the measurement 
of the statistic one hundred times. The mean of these ran- 
dom realizations is shown (it is virtually indistinguishable 
from unity) bracketed by the rms scatter around it. This 
gives a rough indication of the typical uncertainty on the 
measurement (this estimated uncertainty assumes uncorre- 
lated marks, so it is almost certainly an underestimate of 
the true error on the measurement). 
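The randomization procedure just described is straightforward to sketch. The helper below is a hypothetical illustration: the name `randomized_mark_band`, its arguments, and the default seed are assumptions, and `statistic` stands for any user-supplied function that takes an array of marks (with galaxy positions held fixed) and returns the marked statistic in each separation bin.

```python
import numpy as np

def randomized_mark_band(statistic, marks, n_realizations=100, seed=0):
    """Mean and rms scatter of a marked statistic over shuffled-mark realizations.

    Shuffling reassigns the observed marks to galaxies at random, which
    erases any correlation between mark and position while preserving
    the one-point distribution of marks.
    """
    rng = np.random.default_rng(seed)
    marks = np.asarray(marks, dtype=float)
    samples = np.array([statistic(rng.permutation(marks))
                        for _ in range(n_realizations)])
    # Mean of the realizations, and the rms scatter around that mean, per bin.
    return samples.mean(axis=0), samples.std(axis=0)
```

The returned mean should sit close to the value expected for uncorrelated marks, and the scatter traces out the band shown by the dotted lines; as noted above, a band built this way assumes uncorrelated marks and so probably underestimates the true uncertainty.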

Note the non-trivial scale dependence of the statistics
in each panel, and note that the scale-dependence is very
different in the two bands. Close pairs tend to be more
luminous than average in K, but less luminous than average
in B.

Figure 2. Mean mark as a function of parent halo mass in the
GIF semi-analytic galaxy formation models at z = 0.2, for a
variety of marks. While differences between the marks are relatively
small in the range $10^{??} < M/h^{-1}M_\odot < 10^{??}$, more massive
galaxies are more luminous than the mean in the redder bands.
The mass dependence of any given mark gives rise to nontrivial
scale dependence of the associated marked statistics.

The smooth dashed curves in the different panels show
the result of inserting $g_1(m)$, $g_2(m)$ and $\langle w|m\rangle$ from the
simulations (cf. Figures 1 and 2) in the halo model
formulae given earlier. (In practice, we approximate the two-halo
terms using the simpler expressions given in Section 5.)
Recall that all scale dependence in these calculations is the
result of the fact that massive haloes extend to larger radii
and populate denser regions, and that the mean weight
depends on halo mass. There are no additional environmental
effects, and there are no correlations between luminosity and
position within the halo. Comparison with the symbols shows
excellent agreement on scales larger than $2\,h^{-1}$ Mpc,
suggesting that the analytic calculation has captured the essence of
the physics at large separations. On smaller scales, however,
there are differences, particularly for the weighted
correlation functions shown in the bottom panels. The agreement
on the larger scales which are dominated by the two-halo
term is reassuring, because it suggests that modification to
the one-halo term is all that is necessary to describe the
statistics.

Figure 4 shows similar results, but now when the star
formation rate is used as the mark. As when the luminosity
was the mark, the halo model calculation provides a
reasonable description of the marked statistics on scales larger
than a few Mpc, but it significantly over-predicts the
signal on small scales. In the next section, we argue that most
of this discrepancy arises from the fact that, although the
model allows for the possibility that marks may depend on
halo mass, it does not allow marks to depend on position
within a halo. Thus, comparison of the model curves with
the measurements provides some indication of the
importance of such mark-gradients. Evidently, such gradients only
matter on small scales; this is sensible, since one does not
expect the detailed distribution of marks within a halo to
affect measurements on scales which are significantly larger
than that of a typical halo.

Figure 3. Marked statistics in which B- and K-band
luminosities were used as the mark. Symbols show measurements
in the GIF semi-analytic galaxy formation models, and dotted
curves show an estimate of the uncertainty on the measurement.
Dashed curves show the halo-model calculation developed earlier,
in which there is no distinction between the central galaxy and all
the others in a halo. Solid curves show the halo-model calculation
described in Section 4.1, in which central galaxies are special.

The next section shows how the halo model can be
extended to include mark-gradients, thus allowing one to
address the question of what causes these gradients. In
particular, we show that allowing the central object in a halo
to be different from all the others accounts for most of the
discrepancy on small scales.

Figure 4. Marked statistics in which star formation rate was
used as the mark. Symbols show measurements in the GIF
semi-analytic galaxy formation models, and dotted curves show an
estimate of the uncertainty on the measurement. Dashed curves show
the halo-model calculation developed earlier, and solid curves
show the halo-model calculation developed in Section 4.1.



4 MARKED STATISTICS WHEN MARKS 
DEPEND ON POSITION WITHIN HALO 

This section provides two simple parametrizations of the 
effects of a correlation between galaxy mark and position 
within the halo. In the first model, this correlation is par- 
ticularly simple: the central galaxy in a halo is supposed 
to be different from all the others, but, other than this, all 
the previous assumptions about the independence of marks 
apply. This case, while simple, is a standard assumption in 
semi-analytic and SPH-based galaxy formation models (e.g. 
Kauffmann et al. 1999; Zheng et al. 2005). It is also pre- 
cisely the approximation currently used to interpret mea- 
surements of the luminosity dependence of galaxy clustering. 
The second model allows for more sophisticated correlations
between galaxy mark and position within the halo; it may
be useful in studies where the mark is galaxy color, since
redder galaxies in a halo are expected to be more centrally
concentrated than the bluer ones.

However, in neither model is the mark of one galaxy 
within a halo physically correlated with that of another: 
the correlation is purely statistical. For instance, Zheng et 
al. (2005) find that, in their semi-analytic models, there is 
a weak correlation between the number and luminosity of 
satellite galaxies in less massive haloes and the luminosity 
of the central galaxy: both are smaller if the central galaxy 
is more luminous. Such a correlation is not present in the 
models developed below. One signature of such a physical 
correlation would be a successful description of $M_1(r)$ even
on small scales, but gross discrepancies between model and
measured $(1+W)/(1+\xi)$. Since we see discrepancies in
both the upper and lower panels of Figures 3 and 4, this
is less of an immediate concern. In any case, accounting for
this correlation is more complicated, and will be reported
elsewhere.

Figure 5. Mean mark as a function of parent halo mass in the
GIF semi-analytic galaxy formation models at z = 0.2, for a
variety of marks. Panel on the left shows the mean mark for the
central galaxy in a halo, and panel on the right shows the mean
mark for the other galaxies. It is interesting to compare both these
panels with Figure 2.



4.1 The centre-satellite model 

Figure 2 shows how the mean mark depends on halo mass.
This mean value was computed by averaging over all the
galaxies in a halo, whatever their location within it.
However, in the GIF models, the central galaxy in a halo is very
different from the others. To illustrate, Figure 5 shows the
same marks as in Figure 2, but now the marks associated
with the central galaxy (left panel) are shown separately 
from those associated with the other galaxies (right panel). 
Clearly, the mass-dependence of the marks is very different 
in the two cases. The halo model calculations of the previ- 
ous section showed that mass dependence of any given mark 
gives rise to nontrivial scale dependence of the associated 
marked statistics. Hence, it is possible that the failure of 
the halo model calculation on small scales (in the previous 
section) was due to the neglect of this difference between 
central and satellite objects. In this respect, the model must 
be extended to allow for a correlation between the value of 
the mark and its location. 

To include this effect, assume that haloes which host
galaxies host one and only one central galaxy, and possibly
many non-central galaxies. We will sometimes refer to these
other galaxies as satellites. Let $g_1^{\rm cen}(m)$ denote the fraction
of m-haloes which host a central galaxy, and let $g_n^{\rm sat}(m)$
denote the nth factorial moment of the distribution of the
number of satellites in a halo. Further, define

$$ g_1(k|m) = g_1^{\rm cen}(m) + g_1^{\rm sat}(m)\, u(k|m) \tag{24} $$

and

$$ g_1^{w}(k|m) = g_1^{\rm cen}(m)\, \langle w_{\rm cen}|m\rangle + g_1^{\rm sat}(m)\, \langle w_{\rm sat}|m\rangle\, u(k|m); \tag{25} $$

these are the analogues of the mean number times density
profile, and mark-weighted number times density profile.



Currently popular (centre plus Poisson satellite) models (e.g.
Kravtsov et al. 2004) have $g_1^{\rm cen}(m) = 1$ for $m$ greater than
some minimum mass, $g_1^{\rm cen} = 0$ for smaller $m$, $g_1^{\rm sat}(m) = 0$ if
$g_1(m) < 1$, and $g_n^{\rm sat}(m) = [g_1^{\rm sat}(m)]^n$.
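A centre plus Poisson satellite prescription of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function name `centre_satellite_moments` is hypothetical, and the choice that the mean satellite number equals the total mean count minus one when a central galaxy is present (and zero otherwise) is an assumption made here for concreteness, not something specified in the text.

```python
def centre_satellite_moments(g1_total, m, m_min):
    """Occupation moments for a halo of mass m in a centre + Poisson satellite model.

    g1_total : mean total number of galaxies in the halo
    m_min    : minimum halo mass that hosts a central galaxy
    Returns (g1_cen, g1_sat, g2_sat, g2_total).
    """
    # One central galaxy above the mass threshold, none below it.
    g1_cen = 1.0 if m > m_min else 0.0

    # ASSUMPTION: satellites make up whatever the central does not account for.
    g1_sat = max(g1_total - 1.0, 0.0) if g1_cen else 0.0

    # Poisson satellites: second factorial moment is the square of the mean.
    g2_sat = g1_sat ** 2

    # Pairs are centre-satellite (counted in both orders) plus satellite-satellite.
    g2_total = 2.0 * g1_cen * g1_sat + g2_sat
    return g1_cen, g1_sat, g2_sat, g2_total
```

For example, a halo above the threshold with a mean of three galaxies has one central, a mean of two satellites, and a total second factorial moment of 2*1*2 + 4 = 8, which is what one obtains directly for N = 1 + S with S Poisson-distributed with mean 2.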

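Since there can only be one central galaxy, the
unweighted correlation function $\xi(r)$ is the Fourier transform
of the sum of

$$ P_{\rm 1h}(k) = \int dm\, \frac{dn(m)}{dm}\, \frac{2\, g_1^{\rm cen}(m)\, g_1^{\rm sat}(m)\, u(k|m) + g_2^{\rm sat}(m)\, |u(k|m)|^2}{n_{\rm gal}^2}, $$

$$ \frac{P_{\rm 2h}(k)}{P_{\rm Lin}(k)} = \left[\int dm\, \frac{dn(m)}{dm}\, \frac{g_1(k|m)}{n_{\rm gal}}\, b(m)\right]^2, $$

where $g_1(k|m)$ is the quantity defined in equation (24).
The first term in $P_{\rm 1h}$ represents the contribution from
centre-satellite pairs, and the second from satellite-satellite
pairs. (The density run of central galaxies around their host
haloes is a delta-function.)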

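Similarly, the halo model estimate of $M_1(r)$ requires
evaluation of

$$ \mathcal{W}_1^{\rm 1h}(k) = \int dm\, \frac{dn(m)}{dm}\, \frac{\langle w_{\rm cen}|m\rangle + \langle w_{\rm sat}|m\rangle}{2\bar w}\,
   \frac{2\, g_1^{\rm cen}(m)\, g_1^{\rm sat}(m)\, u(k|m)}{n_{\rm gal}^2}
 + \int dm\, \frac{dn(m)}{dm}\, \frac{\langle w_{\rm sat}|m\rangle}{\bar w}\, \frac{g_2^{\rm sat}(m)\, |u(k|m)|^2}{n_{\rm gal}^2}, $$

$$ \frac{\mathcal{W}_1^{\rm 2h}(k)}{P_{\rm Lin}(k)} =
   \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{g_1^{w}(k|m)}{\bar w\, n_{\rm gal}}\right]
   \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{g_1(k|m)}{n_{\rm gal}}\right], $$

where $g_1(k|m)$ and its mark-weighted analogue $g_1^{w}(k|m)$ are the
quantities defined in equations (24) and (25).
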

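And the Fourier transform of the weighted correlation
function becomes

$$ W_{\rm 1h}(k) = \int dm\, \frac{dn(m)}{dm}\, \frac{\langle w_{\rm cen}|m\rangle}{\bar w}\, \frac{\langle w_{\rm sat}|m\rangle}{\bar w}\,
   \frac{2\, g_1^{\rm cen}(m)\, g_1^{\rm sat}(m)\, u(k|m)}{n_{\rm gal}^2}
 + \int dm\, \frac{dn(m)}{dm}\, \frac{\langle w_{\rm sat}|m\rangle^2}{\bar w^2}\, \frac{g_2^{\rm sat}(m)\, |u(k|m)|^2}{n_{\rm gal}^2}, $$

$$ \frac{W_{\rm 2h}(k)}{P_{\rm Lin}(k)} = \left[\int dm\, \frac{dn(m)}{dm}\, b(m)\, \frac{g_1^{w}(k|m)}{\bar w\, n_{\rm gal}}\right]^2, $$

where $g_1^{w}(k|m)$ is the mark-weighted quantity defined in equation (25).
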



The solid curves in Figures 3 and 4 show these halo model
calculations; they are in substantially better agreement with
the measurements than the dashed curves. (In practice, we
approximate the two-halo terms by the simpler expressions
given in Section 5.)

It is easy to see why this happens. Consider, for example,
the K-band luminosity. Figure 5 shows that the central
object is usually substantially more luminous than the
satellites, especially at higher masses. Moreover, the satellites
are also less luminous than when we assign them weights in
which the central object is not treated as special, as in
Figure 5. When the central object is special, then pairs with
separations of order the diameter of a typical halo will be
dominated by the satellite-satellite term. Since this has smaller
weights than when the centre was not special, the resulting







[Figure panels: horizontal axis $r/h^{-1}{\rm Mpc}$, logarithmic, from 1 to 10.]



Figure 6. Marked statistics in which K-band luminosities were
used as the mark. Symbols show measurements in the GIF
semi-analytic galaxy formation models, and dotted curves show an
estimate of the uncertainty on the measurement. Solid curves show
the full halo-model calculation developed in Section 4.1 in which
central galaxies are special. The two dashed curves in each panel
show the one- and two-halo contributions to the statistic, and the
dotted curves show the centre-satellite and the satellite-satellite
contributions to the one-halo term. The centre-satellite term
dominates on the smallest scales.



values of $\mathcal{M}_1$ and $(1+W)/(1+\xi)$ are smaller. Similar
consideration of the differences between mean satellite weights
when the central object is and is not special explains the
qualitative differences between the solid and dashed curves
in Figures 3 and 4.

Figure 6 provides an explicit demonstration of the relative
roles played by the various terms in the model when the
mark is K-band luminosity. The panel on the left shows results
for $\mathcal{M}_1$ and the panel on the right shows $(1+W)/(1+\xi)$.
The symbols show the measured values, and the band
around unity traced out by the dotted lines shows an
estimate of the uncertainty on the measurements calculated
by randomizing the marks (as for the previous figures). The
solid curve shows the full marked correlation; the two short
dashed curves show the one- and two-halo contributions, and
the two dotted curves show the centre-satellite (dominates
on small scales) and satellite-satellite contributions to the
one-halo term.
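
The mark-randomization test used for the uncertainty band can be sketched in a few lines. This is only a minimal brute-force illustration, not the procedure used for the GIF measurements: it assumes the usual pair-weighted estimators (the mean of $(w_i+w_j)/2$ and of $w_i w_j$ over all pairs in a separation bin, with marks normalized by the mean mark), it ignores periodic boundaries, and the function names `marked_stats` and `scrambled_band` are hypothetical.

```python
import numpy as np

def marked_stats(pos, marks, bin_edges):
    """Brute-force additive and multiplicative marked statistics.

    Assumed estimators (not taken from the text): for the pairs in each
    separation bin, average (w_i + w_j)/2 and w_i * w_j, with marks
    normalized so that their mean is unity.
    """
    w = marks / marks.mean()
    i, j = np.triu_indices(len(pos), k=1)            # all distinct pairs
    sep = np.linalg.norm(pos[i] - pos[j], axis=1)    # pair separations
    nbins = len(bin_edges) - 1
    which = np.digitize(sep, bin_edges) - 1          # bin index of each pair
    ok = (which >= 0) & (which < nbins)              # drop out-of-range pairs
    npairs = np.bincount(which[ok], minlength=nbins)
    sum_add = np.bincount(which[ok], weights=(0.5 * (w[i] + w[j]))[ok],
                          minlength=nbins)
    sum_mul = np.bincount(which[ok], weights=(w[i] * w[j])[ok],
                          minlength=nbins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sum_add / npairs, sum_mul / npairs    # NaN in empty bins

def scrambled_band(pos, marks, bin_edges, n_shuffles=50, seed=0):
    """Mean and scatter of both statistics after shuffling the marks."""
    rng = np.random.default_rng(seed)
    runs = np.array([marked_stats(pos, rng.permutation(marks), bin_edges)
                     for _ in range(n_shuffles)])
    return runs.mean(axis=0), runs.std(axis=0)       # each has shape (2, nbins)
```

Shuffling the marks among the galaxies preserves the spatial clustering and the mark distribution while erasing any mark-position correlation, so the scatter returned by `scrambled_band` traces out a band around unity of the kind described above.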

It is worth emphasizing that, in Figures 3 and 4, the
mean mark in $m$-haloes is the same function of $m$ for both
the solid and the dashed curves; the only difference is in the
physical interpretation of this mean mark. The solid curves
represent a model in which the central galaxy in a halo is
different from all the others, whereas the dashed curves show
the expected marked statistics if the central galaxy were
not special. Thus, our analysis shows that marked statistics
are well-suited to discriminating between different physical
models of galaxy properties.

The previous plot shows how different physical models
of the marks within halos result in different marked
statistics. For completeness, Figure 7 shows how the dependence
of the mean mark on halo mass affects the statistic. From
bottom to top, the different curves show the relative
contribution from halos with masses greater than $10^{14}\,h^{-1}M_\odot$,



Figure 7. Marked statistics in which K-band luminosities were
used as the mark, shown as a function of the halo mass range
which contributes. Symbols show the same measured values as
in the previous figure, and solid curves show the same full
halo-model calculation. The three dashed curves in each panel show
the fractional contributions of the term in the numerator of the
halo-model expression for the marked correlation function from halos
more massive than $10^{14}\,h^{-1}M_\odot$ (bottom), $10^{13}\,h^{-1}M_\odot$
(middle), and $10^{12.5}\,h^{-1}M_\odot$ (top).



$10^{13}\,h^{-1}M_\odot$, and $10^{12.5}\,h^{-1}M_\odot$ (meaning that, for the $1+\xi$
term in the denominator, the integrals were performed over
the entire range of halo masses, but that they were restricted
to masses greater than these values when the numerator
was computed). The full signal (solid curves) is very well
approximated by the signal from halos more massive than 
as one might expect from a glance at Figure 
These curves indicate that the small scale signal is domi- 
nated by halos with masses around 10^^h~^MQ and greater, 
but that on larger scales, the contribution from less massive 
halos is more significant than that of more massive halos. 



4.2 A more general case 

This section provides a simple parametrization of the effects
of a correlation between galaxy mark and position within
the halo. We continue to assume that there are otherwise no
correlations between the marks of one galaxy and another.
Specifically, assume that there is a deterministic relation
between distance from the halo centre and the value of the
mark: $w(r|m)$. (For instance, suppose some galaxy property
depends on the local density or velocity dispersion within
the parent halo. In reality there is almost certainly scatter
in this relation; we think of $w(r|m)$ as an average value.)
Then



$$
p(w|m)\,dw = p(r|m)\,dr = 4\pi r^2\,\frac{\rho(r|m)}{m}\,dr \qquad (26)
$$

for some monotonic relation $w(r|m)$, and so

$$
\langle w|m\rangle = \int dw\,p(w|m)\,w = \int dr\,p(r|m)\,w(r|m). \qquad (27)
$$



If we define

$$
w(k|m) \equiv \frac{\int dr\,4\pi r^2\,w(r|m)\,\rho(r|m)\,\sin(kr)/kr}
                   {\int dr\,4\pi r^2\,w(r|m)\,\rho(r|m)}, \qquad (28)
$$

then this quantity is the normalized Fourier transform of the
weighted density profile.
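
As a rough numerical illustration of equation (28), the weighted transform can be evaluated by direct quadrature on a radial grid. The sketch below is not part of the original calculation: the helper names `trapezoid` and `weighted_profile_ft` are hypothetical, and the check uses the toy exponential profile of the Appendix ($\rho \propto \exp(-r/r_s)/(r/r_s)^2$ with $w \propto r/r_s$), for which the integrals can be done analytically and give $1/(1+k^2r_s^2)$ for the weighted profile and $\arctan(kr_s)/(kr_s)$ for the unweighted one.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def weighted_profile_ft(k, r, rho, w):
    """Normalized Fourier transform of the mark-weighted profile, eq. (28).

    r   : radial grid
    rho : density profile evaluated on r
    w   : mark as a function of radius, w(r|m), evaluated on r
    """
    # np.sinc(x) = sin(pi x)/(pi x), so this is sin(kr)/(kr), finite at kr = 0
    sinc_kr = np.sinc(k * r / np.pi)
    shell = 4.0 * np.pi * r**2 * w * rho
    return trapezoid(shell * sinc_kr, r) / trapezoid(shell, r)

# Toy check in units of r_s = 1 (illustrative values only)
r = np.linspace(1e-6, 50.0, 200001)
rho = np.exp(-r) / r**2
k = 2.0
w_k = weighted_profile_ft(k, r, rho, r)                 # mark increases with r
u_k = weighted_profile_ft(k, r, rho, np.ones_like(r))   # no mark gradient
```

Setting the mark to a constant recovers the ordinary profile transform $u(k|m)$, so the same routine gives both ingredients of the one-halo terms; comparing `w_k` with `u_k` shows directly how strongly a given gradient suppresses or enhances power at a given $k$.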

Note that although there is no scatter in the marks at 
fixed r, there is scatter in the marks at fixed pair separation 
(the two members of a pair of fixed separation can come 
from different distances from the halo center). Thus, in this 
model, the variance in weights at fixed pair separation is 
non-trivial. 

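The marked statistic $\mathcal{M}_1(r)$ is now given by terms like

$$
\begin{aligned}
\mathcal{W}_{1h}(k) &= \int dm\,\frac{dn(m)}{dm}\,
  \frac{\langle w|m\rangle}{\bar w}\,
  \frac{g_2(m)\,w(k|m)\,u(k|m)}{\bar n_{\rm gal}^2}, \\
\frac{\mathcal{W}_{2h}(k)}{P_{\rm Lin}(k)} &=
  \left[\int dm\,\frac{dn(m)}{dm}\,b(m)\,\frac{g_1(m)}{\bar n_{\rm gal}}\,
  \frac{\langle w|m\rangle}{\bar w}\,w(k|m)\right]
  \left[\int dm\,\frac{dn(m)}{dm}\,b(m)\,\frac{g_1(m)}{\bar n_{\rm gal}}\,u(k|m)\right];
\end{aligned}
$$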



in effect, the fact that the weight now has a profile means
that one must replace one power of $u(k|m)$ with $w(k|m)$.
Similarly, $W(k)$, the Fourier transform of the weighted
correlation function, becomes $W_{1h}(k) + W_{2h}(k)$ where



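$$
\begin{aligned}
W_{1h}(k) &= \int dm\,\frac{dn(m)}{dm}\,
  \frac{\langle w|m\rangle^2}{\bar w^2}\,
  \frac{g_2(m)\,|w(k|m)|^2}{\bar n_{\rm gal}^2}, \\
\frac{W_{2h}(k)}{P_{\rm Lin}(k)} &=
  \left[\int dm\,\frac{dn(m)}{dm}\,b(m)\,\frac{g_1(m)}{\bar n_{\rm gal}}\,
  \frac{\langle w|m\rangle}{\bar w}\,w(k|m)\right]^2 .
\end{aligned}
$$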



In this case, both powers of $u(k|m)$ have been replaced.

Note in particular that the effect of mark-gradients is
expected to be more dramatic for $(1+W)/(1+\xi)$ than it
is for $\mathcal{M}_1$: $W_{1h}$ requires that two powers of $u(k|m)$
be replaced by $w(k|m)$, whereas $\mathcal{W}_{1h}$ only requires one.
Thus, the analysis above indicates that incorporating
weight-gradients in the halo-model description is relatively
straightforward. Appendix A illustrates these effects using a
fully analytic toy model of the gradients. We expect this model
to be useful for studying color gradients in clusters.

If there are true correlations between the weights of one
galaxy and others in the same halo (such as the weak
correlation reported by Zheng et al. 2005), then the expression for
$W_{1h}(k)$ becomes more complicated still. The analysis above
suggests that $|w(k|m)|^2$ in the integrand for $W_{1h}$ should be
replaced with a term which accounts for the correlation
between the marks, as well as the shape of the density profile;
this is the subject of work in progress.



5 BIASING ON LARGE SCALES 

On scales which are larger than the diameter of a typical
halo, the marked statistics above are dominated by the
contribution from pairs in separate haloes. In this limit, the
scale dependence of marked statistics is simply related to the
shape of the linear theory power spectrum. To see why,
consider the unweighted correlation function $\xi(r)$, the weighted
correlation function $W(r)$ and the additive marked statistic
$\mathcal{M}_1(r)$. On large scales, the Fourier transforms of halo
profiles $u(k|m) \to 1$. Hence,
$g_1(k|m) \to g_1^{\rm cen} + g_1^{\rm sat} = g_1(m)$
and $g_1^{w}(k|m) \to g_1(m)\,\langle w|m\rangle/\bar w$. If we define



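$$
b_{\rm gal} \equiv \int dm\,\frac{dn(m)}{dm}\,
  \frac{g_1(m)}{\bar n_{\rm gal}}\,b(m) \qquad (29)
$$

and

$$
b_{w1} \equiv \int dm\,\frac{dn(m)}{dm}\,
  \frac{g_1(m)}{\bar n_{\rm gal}}\,
  \frac{\langle w|m\rangle}{\bar w}\,b(m), \qquad (30)
$$

then

$$
\begin{aligned}
P_{2h}(k) &\approx b_{\rm gal}^2\,P_{\rm Lin}(k), &\qquad (31) \\
\mathcal{W}_{2h}(k) &\approx b_{\rm gal}\,b_{w1}\,P_{\rm Lin}(k), &\qquad (32) \\
W_{2h}(k) &\approx b_{w1}^2\,P_{\rm Lin}(k), &\qquad (33)
\end{aligned}
$$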


on large scales. Thus, suitably defined combinations of $\xi$, $W$
and $\mathcal{M}_1$ provide measurements of $b_{w1}/b_{\rm gal}$.
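
The two large-scale bias factors are simple mass-weighted integrals, which can be evaluated on a grid once the halo-model ingredients are specified. The sketch below is an illustration under stated assumptions rather than the calculation used in the paper: it assumes the standard normalizations $\bar n_{\rm gal} = \int dm\,(dn/dm)\,g_1(m)$ and $\bar w = \int dm\,(dn/dm)\,g_1(m)\,\langle w|m\rangle/\bar n_{\rm gal}$, and the mass function, occupation, bias and mean-mark relations are made-up toy forms chosen only so the script runs. The names `large_scale_biases` and `b_mark` are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def large_scale_biases(m, dn_dm, g1, b, w_mean):
    """Galaxy bias and mark-weighted bias from halo-model ingredients.

    m      : halo mass grid
    dn_dm  : halo mass function on m
    g1     : mean number of galaxies per halo, g_1(m)
    b      : halo bias b(m)
    w_mean : mean mark in m-haloes, <w|m>
    """
    n_gal = trapezoid(dn_dm * g1, m)                   # assumed normalization
    w_bar = trapezoid(dn_dm * g1 * w_mean, m) / n_gal  # assumed mean mark
    b_gal = trapezoid(dn_dm * g1 * b, m) / n_gal
    b_mark = trapezoid(dn_dm * g1 * (w_mean / w_bar) * b, m) / n_gal
    return b_gal, b_mark

# Toy ingredients (purely illustrative, not fits to any simulation)
m = np.logspace(11, 15, 400)
dn_dm = m**-1.9 * np.exp(-m / 1e14)
g1 = m / 1e12
b = 1.0 + (m / 1e13)**0.5
w_mean = (m / 1e12)**0.3

b_gal, b_mark = large_scale_biases(m, dn_dm, g1, b, w_mean)
```

With these toy inputs the mean mark and the halo bias both increase with mass, so the mark-weighted bias exceeds the galaxy bias; if the mean mark were independent of halo mass the two would coincide and the large-scale marked statistics would be featureless.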

In practice, measurement of $\xi$ requires use of a random
catalog as well as knowledge of the survey boundary, whereas 
measurements of marked statistics do not (c.f. discussion 
at the end of Section I^Jl. Hence, the most straightforward 
measurement of $b_{w1}/b_{\rm gal}$ comes from using the fact that, on
large scales,

$$
\frac{[1 + W(r)]/[1 + \xi(r)] - 1}{\mathcal{M}_1(r) - 1}
  \approx 1 + \frac{b_{w1}}{b_{\rm gal}}. \qquad (34)
$$



The previous sections showed that, on large scales, the halo
model calculation is in excellent agreement with the
simulations. Hence, our analysis shows that marked correlations
allow a simple measurement of this ratio. It is also
straightforward to estimate the relative bias factors associated
with two different weights: simply measure $b_{w1}/b_{\rm gal}$ for
each weight, and then take the ratio.
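
The large-scale relation between the two marked statistics follows from a line of algebra, which the short check below reproduces numerically. It is a sketch under the linear-bias assumptions of this section only: the unweighted, singly weighted and doubly weighted correlation functions are taken to be $b_{\rm gal}^2$, $b_{\rm gal}b_{\rm mark}$ and $b_{\rm mark}^2$ times the same linear-theory correlation function, and $\mathcal{M}_1$ is taken to be the ratio of one plus the singly weighted function to $1+\xi$. The function name and the input values are hypothetical.

```python
def large_scale_ratio(b_gal, b_mark, xi_lin):
    """Ratio of (marked correlation - 1) to (M_1 - 1) under linear biasing.

    b_gal  : galaxy bias factor
    b_mark : mark-weighted bias factor
    xi_lin : linear-theory correlation function at the separation of interest
    """
    xi = b_gal**2 * xi_lin            # unweighted correlation function
    w_one = b_gal * b_mark * xi_lin   # one weight per pair (enters M_1)
    w_two = b_mark**2 * xi_lin        # both members weighted (enters W)
    m_1 = (1.0 + w_one) / (1.0 + xi)
    marked = (1.0 + w_two) / (1.0 + xi)
    return (marked - 1.0) / (m_1 - 1.0)

# Illustrative numbers only
ratio = large_scale_ratio(b_gal=1.2, b_mark=1.5, xi_lin=0.05)
```

The common factor $1+\xi$ and the linear-theory correlation function cancel, leaving $(b_{\rm mark}^2 - b_{\rm gal}^2)/[b_{\rm gal}(b_{\rm mark} - b_{\rm gal})] = 1 + b_{\rm mark}/b_{\rm gal}$ at every separation in the linear regime, which is why no random catalog or estimate of $\xi$ itself is needed.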



6 DISCUSSION 

A standard assumption in semi-analytic galaxy formation 
models is that all galaxy properties are determined by the 
formation histories of their parent haloes, which, in turn, de- 
pend on halo mass. Thus, correlations between galaxy prop- 
erties and environment are primarily driven by the correla- 
tion between halo mass and environment. It is these correla- 
tions which marked statistics are well-suited to quantifying. 
The halo-model expressions for marked statistics derived in
Section 3 have no environmental trends other than those
which come from the dependence of halo abundances on 
environment. In essence, the halo model represents the lan- 
guage with which to describe the predictions of standard 
galaxy formation models. 

The extent to which simple halo-model calculations 
such as the one developed here are able to reproduce mea- 
surements of marked correlations in real data provides a 
test of the standard assumption that galaxy properties are 
more closely related to the formation histories of their par- 
ent haloes, rather than to additional environmental effects. 
Such tests are particularly interesting for two reasons. First,
recent work indicates that halo formation correlates with both
mass and environment: at fixed mass, haloes in dense regions
formed earlier (Sheth & Tormen 2004), although this effect is
stronger for low mass haloes (Gao, Springel & White 2005).
One might expect to see the results of this additional
environmental effect manifest in the galaxy distribution.
Secondly, current halo-model based interpretations of the
luminosity dependence of clustering (e.g. Zehavi et al. 2005)
implicitly assume that there are no environmental effects
other than those which come from halo biasing.

The simplest halo model calculation (Section 3.2) assumes
that, within a halo, the mark associated with a galaxy
is independent of position, and of the marks of the other 
galaxies in the halo. Despite the extreme simplicity of this 
model, the resulting marked statistics show complex scale 
dependence, which is entirely due to the fact that most 
galaxy attributes are strong functions of the masses of the 
haloes which host them (Figures and |5J , and halo sizes 
and clustering depend on halo mass. 

Comparison between this simple model and measurements
in the GIF semi-analytic galaxy formation model
(Figures 3 and 4) indicates that the assumptions which
underlie the halo model description are an excellent
approximation on scales which are larger than the typical diameters
of dark matter haloes. On these large scales, the statistics of
marked pairs studied here can be thought of as measuring
linearly biased versions of the dark matter power spectrum.
Prescriptions for estimating these bias factors are given in
Section 5.

On smaller scales (those dominated by pairs in the same
halo), the halo-model calculation which assumes that marks
do not correlate with position within the halo is inaccurate.
This is because, in the GIF models, the central galaxy in a
halo is different from all the others. A halo-model calculation
which includes this effect was developed in Section 4.1 and
shown to result in substantially better agreement with the
measurements on small scales (Figures 3 and 4). This illustrates
that marked statistics provide sharp tests of different
physical models of galaxy formation. A more general model for
correlations between galaxy mark and position within the
parent halo was developed, but not tested, in Section 4.2,
and a toy model illustrating the effects of mark gradients
was outlined in Appendix A.

Galaxies are almost certainly associated with appropriately
selected subclumps within haloes (Gao et al. 2004;
Zentner et al. 2005), and so mark gradients are almost
certainly associated with the formation histories (and
tidal-stripping processes) of the subclumps which host galaxies.
Sheth & Jain (2003) develop the formalism for incorporating 
halo substructure into the halo-model description of cluster- 
ing. Therefore, it is likely that incorporating marked statis- 
tics into that formalism will prove fruitful. Sheth, Abbas & 
Skibba (2004) describe how to do this for the weighted cor- 
relation function — the results presented here show that it is 
straightforward to extend their analysis to the other marked 
statistics. 

The simplest halo model calculations presented here
indicate that the lowest order marked statistics $\mathcal{M}_1(r)$ and
$W(r)$ encode information about the mean correlation
between mark and halo-mass. (Note that this information is
provided without ever actually dividing the galaxy distri- 
bution up into 'clusters'.) However, this quantity can be 
estimated from other methods. For instance, Zehavi et al. 
(2005) describe a halo-model interpretation of the luminos- 
ity dependence of clustering in the SDSS by studying clus- 
tering in subsamples defined by galaxy luminosity. Their 
analysis can be used to infer how the mean luminosity of 
a galaxy correlates with the mass of its host halo, pro- 
vided one assumes that the environment plays no additional 
role than through halo biasing. Therefore, insertion of their 
luminosity-mass correlations in the halo model description 
developed here represents a prediction for the shape of the
luminosity-marked correlations in the SDSS. If this
prediction agrees with the actual measurement of marked statistics
in the SDSS, then this will provide strong empirical
justification for the assumption that there are no additional
environmental effects. This test is the subject of Skibba, Sheth
& Connolly (2005).

Figure 8. Marked statistics in which K-band luminosity and
stellar mass were used as marks. Symbols show measurements
in the GIF semi-analytic galaxy formation models, and dotted
curves show an estimate of the uncertainty on the measurement.
Dashed and solid curves show the halo-model calculation for each
statistic when $M_{\rm star}$ is the mark (the halo-model calculations for
$L_K$ are shown in Figure 3). The solid curves are the result of
including the fact that the central galaxy in a halo is different
from the others.

One may turn this statement around, and ask if the
luminosity-weighted marked statistics ($\mathcal{M}_1$ or $W$) provide
any information beyond what one gets from analysis of the
luminosity dependence of clustering. Clearly, marked statistics
provide information about environmental effects which the
other method does not. However, if there are indeed no
additional environmental effects, then our analysis shows that
the two methods provide equivalent information vis-à-vis
correlations between marks and halo masses. Even in this
case, however, marked statistics are an attractive choice
because they are substantially simpler to estimate (random
catalogs are unnecessary), and they do not require
division of the catalog into small luminosity bins to calibrate
the correlation between mark and halo mass, thus allowing
a higher signal-to-noise measurement from a larger catalog
(rather than from smaller subsamples split by the value of
the mark).

Finally, Figure 8 illustrates another way in which our
analysis aids in understanding correlations between galaxy
observables and environment. The Figure compares $\mathcal{M}_1$ and
$(1+W)/(1+\xi)$ when the mark is K-band luminosity (lower
set of symbols in each panel) and when stellar mass (upper
set of symbols) is the mark. The trends traced out by both
marks are the same (close pairs are more luminous and have
larger stellar masses), although the amplitude is larger when
stellar mass is used as the mark. (To better show that these
differences are significant, we have attached the error bars to
each set of points, rather than using the same format as in
the previous Figures.) The differences between the marked
statistics suggest that close pairs have larger mass-to-light
ratios. What causes this correlation?

Our halo-model calculation when $M_{\rm star}$ is the mark is
shown as the smooth curve; the analogous calculation for $L_K$
was shown in Figure 3. In both cases, the halo-model
calculation provides an excellent description of the statistics, at
least on scales larger than $2\,h^{-1}$Mpc for the weighted
correlation function, and down to even smaller scales for $\mathcal{M}_1$.
This agreement shows that, although the mass-to-light ratio
is higher in dense regions, this environmental dependence is
entirely due to the individual correlations between halo mass
and mass-to-light ratio (Figure 1) and between halo mass
and environment. It will be interesting to see if other
correlations between observables (e.g., the Fundamental Plane)
show environmental trends for similar reasons.

This Appendix illustrates the ideas presented in the main
text using a toy model in which relatively transparent
analytic results can be derived, and whose ingredients are
qualitatively similar to the more exact calculations shown in the
main text. Let

$$
\rho(r) = \rho_s\,\frac{\exp(-r/r_s)}{(r/r_s)^2} \qquad (A1)
$$

denote the density run of the mass around a halo centre.
The total mass associated with this profile is

$$
M = \int dr\,4\pi r^2\,\rho(r) = 4\pi r_s^3\,\rho_s. \qquad (A2)
$$

The normalized Fourier transform of this profile is

$$
u(k) = \frac{1}{M}\int dr\,4\pi r^2\,\rho(r)\,\frac{\sin(kr)}{kr}
     = \frac{\arctan(kr_s)}{kr_s}. \qquad (A3)
$$

The correlation function is proportional to the Fourier
transform of the square of this quantity.

Now suppose that the probability a galaxy lies at distance
$r$ from the center is

$$
p(r)\,dr = \frac{4\pi r^2\,dr\,\rho(r)}{4\pi r_s^3\,\rho_s}
         = \exp(-r/r_s)\,\frac{dr}{r_s}, \qquad (A4)
$$

i.e., galaxies trace the mass. Further, suppose that objects
which are more distant from the centre are more luminous:
$L/L_* = r/r_s$ for some constant $L_*$. Then the distribution of
luminosities is

$$
p(L)\,dL = p(r)\,dr = \exp(-L/L_*)\,\frac{dL}{L_*}. \qquad (A5)
$$

Thus, the mean luminosity is $L_*$. This will be useful shortly.

Note that both the density run $\rho(r)$ and the luminosity
distribution $p(L)$ have rather realistic shapes, so the results
which follow should resemble the real world at least
qualitatively.

In this model, the run of the luminosity-weighted profile is

$$
\frac{L(r)}{L_*}\,\rho(r) = \rho_s\,\frac{\exp(-r/r_s)}{(r/r_s)}, \qquad (A6)
$$

so the normalized Fourier transform of this profile is

$$
w(k) = \frac{\int dr\,4\pi r^2\,L(r)\,\rho(r)\,\sin(kr)/kr}
            {\int dr\,4\pi r^2\,L(r)\,\rho(r)}
     = \frac{1}{1 + k^2 r_s^2}. \qquad (A7)
$$

Hence, the luminosity-weighted correlation function is
proportional to $(\pi/4)\exp(-r/r_s)/(2\pi^2)$. Since $w(k)^2/u(k)^2 < 1$
for all $k > 0$, this shows that $W(r) < \xi(r)$ for all $r$. Hence,
in this model with luminosity increasing with distance from
halo center, the marked correlation function $(1+W)/(1+\xi)$
decreases with decreasing $r$.

Note that although there is no scatter in luminosities at
fixed $r$, there is scatter in $L$ at fixed pair separation (the two
members of a pair of fixed separation can come from different
distances from the halo center). Thus, in this model, the
variance in weights at fixed pair separation is non-trivial.

Now suppose that we randomize the luminosities within
each halo. This means that the total distribution of
luminosities is still $p(L) = \exp(-L/L_*)/L_*$, but this distribution
now represents the probability that a galaxy in the halo has
luminosity $L$ whatever its distance from the halo center. In
this case, the result of weighting each galaxy by its luminosity
does not yield an $r$-dependent weight, so the weighted
profile is $(\bar L/L_*)\,\rho(r) = \rho(r)$, since the mean of the weights
$\bar L$ is $L_*$. Hence, in this case, $w(k) = u(k)$: the weighted and
unweighted correlation functions are equal.

The calculation in the previous section assumes that
there is no correlation between weight and position within
the parent halo. Hence it can describe the marked statistics
associated with the case when the luminosities have been
randomized. In this case, the small-scale dependence of the
marked correlation function is entirely a consequence of the
fact that the mean weight may depend on halo mass. If
there is some correlation between weight and galaxy position
within the halo, then this will manifest as a discrepancy
between the model and the actual measured marked
correlation.

As a specific example of why such a discrepancy may be
interesting, suppose that the first case (weights increase with
increasing distance from halo center) corresponds to the
luminosity distribution in a blue band, say $L_B$, whereas the
second case corresponds to the luminosity in a redder band,
say $L_R$. The color is defined as $c = L_R/L_B$. For what follows,
it is useful to define $c_* = L_{R*}/L_{B*}$. Since $L_B$ is a
deterministic function of $r$, the color distribution at $r$ is due to the
distribution in $L_R$:
$p(c|r)\,dc = p(L_R = cL_B(r))\,L_B(r)\,dc
 = \exp[-(c/c_*)\,L_B(r)/L_{B*}]\,(L_B(r)/L_{B*})\,dc/c_*$,
so the mean color at $r$ is $c_*\,L_{B*}/L_B(r) = c_*\,(r_s/r)$. This shows that
there is a color gradient: the halo is redder in the center
than it is outside. It is this color gradient which gives rise
to the difference between the two luminosity-weighted
correlation functions. This illustrates that marked correlation
functions encode information about luminosity- and hence
color-gradients.
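
The contrast between the gradient and randomized cases of this toy model can also be seen in a small Monte Carlo experiment. The sketch below is an illustration only, with hypothetical function names and arbitrary sample sizes: it populates a single toy halo with radii drawn from $p(r) = \exp(-r/r_s)/r_s$ in random directions, sets $L/L_* = r/r_s$, and compares the mean product of normalized marks for pairs in bins of separation with the same quantity after the luminosities have been shuffled among the galaxies.

```python
import numpy as np

def toy_halo(n_gal, rng):
    """Positions and luminosities for one toy halo, in units of r_s and L_*."""
    r = rng.exponential(1.0, n_gal)                 # p(r) = exp(-r)
    direction = rng.normal(size=(n_gal, 3))
    direction /= np.linalg.norm(direction, axis=1, keepdims=True)
    return r[:, None] * direction, r.copy()         # L/L_* = r/r_s

def mean_mark_product(pos, marks, edges):
    """Mean of w_i * w_j over pairs in each separation bin (w has unit mean)."""
    w = marks / marks.mean()
    i, j = np.triu_indices(len(pos), k=1)
    sep = np.linalg.norm(pos[i] - pos[j], axis=1)
    nbins = len(edges) - 1
    which = np.digitize(sep, edges) - 1
    ok = (which >= 0) & (which < nbins)
    npairs = np.bincount(which[ok], minlength=nbins)
    total = np.bincount(which[ok], weights=(w[i] * w[j])[ok], minlength=nbins)
    return total / np.maximum(npairs, 1)            # empty bins return 0

rng = np.random.default_rng(0)
pos, lum = toy_halo(800, rng)
edges = np.array([0.0, 0.2, 1.0, 3.0, 20.0])
with_gradient = mean_mark_product(pos, lum, edges)
randomized = mean_mark_product(pos, rng.permutation(lum), edges)
```

Close pairs are dominated by galaxies near the halo centre, which are faint when luminosity increases with radius, so `with_gradient` falls well below unity in the smallest bin and rises with separation. After shuffling, the marks carry no positional information and `randomized` scatters around unity in every bin, up to shot noise.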



